Information Extraction from the Weather Reports in Serbian

Authors

  • Stasa Vujicic Stankovic
  • Vesna Pajic
Abstract

In this paper, we describe a process for extracting information from meteorological texts in Serbian. The text corpus consists of almost 46,000 sentences. Taking into account the specifics of Serbian and the characteristics of the meteorological sublanguage, we develop a classification schema for structuring the extracted information, along with transducers for annotating pieces of information in the text corpus. We describe the transducer for extracting information about daily temperatures and give some evaluation parameters for all other transducers used in the information extraction process.


Similar resources

Extending the Ist-world Database with Serbian Research Publications

This paper describes an effort to use knowledge technologies to gain insights into research activity by exploiting publicly available information on research publications. The specific contribution of this paper is extending the existing IST World database with information on Serbian research publications. We describe the process of information extraction applied in order to fill in the database, base...

An Abstract of the Project Report Of

approved: Prasad Tadepalli. Severe weather in the United States frequently causes huge insured losses to crops and property. It has a major impact on the weather insurance industry and elicits diverse responses from it. Events such as hail, storms, and hurricanes are more likely to cause catastrophic losses, so it becomes crucial to collect and analyze information on these extreme weather events. Thus, weather data col...

WI&CRF: A proposed method for extracting required information from military texts

Military information extraction techniques are of interest to military managers and commanders. However, the usual information extraction techniques cannot be used in that domain, because a military corpus has a special structure that differs from that of a non-military corpus. In this paper, the structure of military documents is compared with the structure of non-military documents. Moreover a new classification is proposed ...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. The system includes preprocessing, document segmentation, feature extraction, and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing it, and detecting and correcting image skew. In document segmentation, an algorith...


Publication date: 2012